1 Using a lookahead window in a compaction-based parallelizing compiler
Toshio Nakatani, Kemal Ebcioglu

January 1991 ACM SIGMICRO Newsletter, Volume 22 Issue 1 

Publisher: ACM Press 

Full text available: pdf(969.83 Additional Information: full citation , abst

KB) index terms 

Lookahead is a common technique for high performance uniprocessor d< 
however, hardware lookahead window is too small to exploit instruction 
run time, while compaction-based parallelizing compilers must suffer frc 
exponential code explosion at compile time. In this paper, we propose a . 
method, which allows inter-basic block code motions within the prespec 
operations, called software lookahead window, ... 



2 Using a lookahead window in a compaction-based parallelizing compiler 
Toshio Nakatani, Kemal Ebcioglu 

November 1990 Proceedings of the 23rd annual workshop and symposi 

Microprogramming and microarchitecture MICRO 22 
Publisher: IEEE Computer Society Press 

Full text available: ^pdfd.l 1 Additional Information: full citation , abst 

MB) citings 
Lookahead is a common technique for high performance uniprocessor d(
however, hardware lookahead window is too small to exploit instruction 
run time, while compaction-based parallelizing compilers must suffer frc 
exponential code explosion at compile time. In this paper, we propose a ; 
method, which allows inter-basic block code motions within the prespec: 
operations, called software lo ... 

3 Automatic microcode generation for horizontally microprogrammed proce; 
Robert J. Sheraga, John L. Gieser

December 1981 ACM SIGMICRO Newsletter , Proceedings of the 14th 
on Microprogramming MICRO 14, Volume 12 Issue 4 
Publisher: IEEE Press, ACM Press 

Full text available: ^pdfn.22 Additional Information: full citation , abst 

MB) citings , index ten 

A procedure is described which permits applications problems coded in j 
Language to be compiled to microcode for horizontally microprogrammi 
experimental language has been designed which is suitable for expressin 
oriented problems for such processors in a distributed processing enviroi 
programs are compiled first to a machine independent intermediate langi 
machine dependent form consisting of elementary microoperatio ... 

4 Maximal static expansion 

Denis Barthou, Albert Cohen, Jean-Francois Collard

January 1998 Proceedings of the 25th ACM SIGPLAN-SIGACT sympc 

of programming languages POPL '98
Publisher: ACM Press 

Full text available: pdf(1.19 Additional Information: full citation , refe:

MB) index terms 



Keywords: expansion of data structure, privatization, single assignment 

5 Views on transportability of Lisp and Lisp-based systems 
Richard J. Fateman

August 1981 Proceedings of the fourth ACM symposium on Symbolic i 



http://portal.acm.org/results.cfm?CFID=13590200&CFTOKEN=853... 2/5/07 



Results (page 1): +code +expansion +I/O Page 3 of 9

computation SYMSAC '81 
Publisher: ACM Press 

Full text available: pdf(489.58 Additional Information: full citation , abst

KB) citings , index ten 

The availability of new large-address-space computers has provided us a 
examine techniques for transferring programming systems, and in partici 
to new computers. We contrast two approaches: designing and building ; 
implementation of Lisp, and (re)writing the system in a "portable" progr 
(C). Our conclusion is that the latter approach may very well be better.

6 Discrete event simulation using PL/I based general and special purpose sin 
Walter C. Metz 

January 1981 Proceedings of the 13th conference on Winter simulation 
'81

Publisher: IEEE Press 

Full text available: pdf(650.17 Additional Information: full citation , abst

KB) citings , index ten 

This paper describes the architecture and language features of a simulate 
developed using a new IBM discrete event simulation package based on 
contains implementations of both the GPSS and SIMPL/I simulation lan 
addition provides the capability for a model developer to create special p 
languages tailored to his unique simulation application. The model descr 
simulates a retail or supermarket store point-of-s ... 

7 Multilingual text processing in a two-byte code 
Lloyd B. Anderson 

July 1984 Proceedings of the 22nd annual meeting on Association for C 
Linguistics , Proceedings of the 10th international conferenci 
Computational linguistics 

Publisher: Association for Computational Linguistics 

Full text available: pdf(368.42

Publisher Additional Information: full citation , abst 
Site 

National and international standards committees are now discussing a tw 
multilingual information processing. This provides for 65,536 separate c 
codes, enough to make permanent code assignments for all the character
alphabets of the world, and also to include Chinese/Japanese characters/ 
the kinds of flexibility required to handle both Roman and non-Roman a 
crucial to separate information units (codes) from gr ... 

8 Compiler code transformations for superscalar-based high performance sys 
S. A. Mahlke, W. Y. Chen, J. C. Gyllenhaal, W.-M. W. Hwu 
December 1992 Proceedings of the 1992 ACM/IEEE conference on Sup 

Supercomputing '92 
Publisher: IEEE Computer Society Press 

Full text available: pdf(1.05 Additional Information: full citation , refe:

MB) index terms 



9 Interactive conversion of sequential to multitasking FORTRAN
Kevin Smith, Bill Appelbe 

June 1989 Proceedings of the 3rd international conference on Supercoi 
Publisher: ACM Press 

Full text available: pdf(972.31 Additional Information: full citation , abst

KB) citings , index ten 

Fully automated compilation of sequential Fortran to efficient multitaski 
impractical; tools need to be developed to aid users in interactively conv 
multitasking Fortran. This paper reports on experience using an interact! 
Assistant Tool (PAT) to convert sequential Fortran applications (ranging 
benchmarks to large application programs) to Cray microtasking Fortran 
advantages and limitations of interactive paralleliz ... 

10 HARE: an optimizing portable compiler for Scheme 
Dan Teodosiu 

January 1991 ACM SIGPLAN Notices, Volume 26 Issue 1 
Publisher: ACM Press 

Full text available: pdf(872.48 Additional Information: full citation , abst

KB)

A highly optimizing Scheme compiler called HARE is presented. A com 
optimization techniques allows for the generation of very efficient code, 
the compiler has been achieved through the use of a virtual machine as a 
generation. The compiler will be used as a test-bed for fine-tuning the in
symbolic architecture, the S-Machine. 

11 Software pipelining loops with conditional branches 
Mark G. Stoodley, Corinna G. Lee 

December 1996 Proceedings of the 29th annual ACM/IEEE internation 

Microarchitecture MICRO 29 
Publisher: IEEE Computer Society 

Full text available: pdf(1.64 Additional Information: full citation , abst

MB) citings , index ten 

Software pipelining is an aggressive scheduling technique that generates 
loops and is particularly effective for VLIW architectures. Few software 
algorithms, however, are able to efficiently schedule loops that contain c 
We have developed an algorithm we call All Paths Pipelining (APP) thai 
shortcoming of software pipelining. APP is designed to achieve optimal 
performance for any run of iterations while providing ef ... 

12 Design decisions influencing the microarchitecture for a Prolog machine 
T. P. Dobry, Y. N. Patt, A. M. Despain

December 1984 ACM SIGMICRO Newsletter , Proceedings of the 17th 
on Microprogramming MICRO 17, Volume 15 Issue 4 
Publisher: IEEE Press, ACM Press 

Full text available: pdf(1.27 Additional Information: full citation , abst

MB) citings , index ten 

The PLM-1 is the first step in the hardware implementation of a heterog* 
processor for logic programming. This paper describes its ISP architectu 
detail some of the design decisions relative to its microarchitecture. 

13 Trace-driven memory simulation: a survey 
Richard A. Uhlig, Trevor N. Mudge

June 1997 ACM Computing Surveys (CSUR), Volume 29 Issue 2 
Publisher: ACM Press 

Full text available: pdf(636.11 Additional Information: full citation , abst

KB) citings , index ten 

As the gap between processor and memory speeds continues to widen, n 
evaluating memory system designs before they are implemented in hard^ 
increasingly important. One such method, trace-driven memory simulati
subject of intense interest among researchers and has, as a result, enjoye< 
and substantial improvements during the past decade. This article survey 
developments by establishing criteria for evaluating trac ... 

Keywords: TLBs, caches, memory management, memory simulation, tr; 
simulation 



14 Techniques for efficient inline tracing on a shared-memory multiprocessor 
S. J. Eggers, David R. Keppel, Eric J. Koldinger, Henry M. Levy

April 1990 ACM SIGMETRICS Performance Evaluation Review , Pro 
1990 ACM SIGMETRICS conference on Measurement and 
computer systems SIGMETRICS '90, Volume 18 Issue 1 

Publisher: ACM Press 

Full text available: ^pdfO.12 Additional Information: full citation , abst 

MB) citings , index ten 

While much current research concerns multiprocessor design, few traces 
programs are available for analyzing the effect of design trade-offs. Exis 
methods have serious drawbacks: trap-driven methods often slow down ] 
by more than 1000 times, significantly perturbing program behavior; mi< 
modification is faster, but the technique is neither general nor portable. 1 
a new tool, called MPTRACE, for collecting tr ... 

15 Implementing functional languages in the Categorical Abstract Machine 
Michel Mauny, Ascander Suarez 

August 1986 Proceedings of the 1986 ACM conference on LISP and fui 

programming LFP '86 
Publisher: ACM Press 

Full text available: pdf(687.85 Additional Information: full citation , refe



16 A Fortran preprocessor for the large program environment 
Neal R. Wagner

December 1980 ACM SIGPLAN Notices, Volume 15 Issue 12 



Publisher: ACM Press 

Full text available: pdf(902.71 Additional Information: full citation , abst

The use of a preprocessor to aid structured programming in Fortran has I 
discussed. This article considers a design philosophy which is especially 
large program development and maintenance. The design is distinguishe 
the form of the original source program in the standard Fortran output b) 
A specific implementation is described. 

17 A survey of resource allocation methods in optimizing microcode compilei 
Robert A. Mueller, Michael R. Duda, Stephen M. O'Haire 
December 1984 ACM SIGMICRO Newsletter , Proceedings of the 17th 

on Microprogramming MICRO 17, Volume 15 Issue 4 
Publisher: IEEE Press, ACM Press 

Full text available: pdf(887.10 Additional Information: full citation , abst

KB) index terms 

This paper surveys results reported on resource allocation in optimizing : 
compilers. Resource allocation is the phase of microcode generation that 
operators of program text to machine registers and functional units. The 
results on resource allocation in optimizing microcode compilers were re 
and subsequent results were reported by Kim and Tan and by Ma and Le 
each of these methods, focusing on th ... 

18 The Java syntactic extender (JSE) 
Jonathan Bachrach, Keith Playford

October 2001 ACM SIGPLAN Notices , Proceedings of the 16th ACM I 
conference on Object oriented programming, systems, lai 
applications OOPSLA '01, Volume 36 Issue 11
Publisher: ACM Press 

Full text available: 1 1 pdfO 98.1 1 Additional Information: full citation , abst 

KB) citings , index ten 

The ability to extend a language with new syntactic forms is a powerful i 
flexible macro system allows programmers to build from a common bas< 
language designed specifically for their problem domain. However, mac 
integrated, capable, and at the same time simple enough to be widely use 
to the Lisp family of languages to date. In this paper we introduce a mac 
the Java Syntactic Extender (JSE), with the superio ...

19 An efficient variable-cost maze router 
Robert K. Korn 

January 1982 Proceedings of the 19th conference on Design automation 
Publisher: IEEE Press 

Full text available: pdf(554.89 Additional Information: full citation , abst

KB) citings , index ten 

A variable cost maze router is described. The router is substantially faste 
maze routers and also provides a flexibility which is valuable in a variety 
particularly well suited for use on multiple layer routing surfaces in whic 
have primary wire directions which are perpendicular to each other. The 
incorporated as a final phase into both a circuit board routing system anc 
router. Experience with these systems ... 

20 MIL primitives for querying a fragmented world 
Peter A. Boncz, Martin L. Kersten 

October 1999 The VLDB Journal — The International Journal on Ver 

Bases, Volume 8 Issue 2 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(261.36 Additional Information: full citation , abst

KB) terms 

In query-intensive database application areas, like decision support and ( 
that use vertical fragmentation have a significant performance advantage 
relational or object oriented applications on top of such a fragmented dat 
yet powerful intermediate language is needed. This problem has been su< 
Monet, a modern extensible database kernel developed by our group. W< 
design choices made in the Monet interprete ... 

Keywords: Database systems, Main-memory techniques, Query languag 
optimization, Vertical fragmentation 
1 Inline function expansion for compiling C programs
P. P. Chang, W.-W. Hwu

June 1989 ACM SIGPLAN Notices , Proceedings of the ACM SIGPLA 
on Programming language design and implementation PLD] 
Issue 7 

Publisher: ACM Press 

Full text available: pdf(1.14 Additional Information: full citation , abst

MB) citings , index ten 

Inline function expansion replaces a function call with the function body 
inline function expansion, programs can be constructed with many small 
complexity and then rely on the compilation to eliminate most of the fun 
Therefore, inline expansion serves a tool for satisfying two conflicting g« 
complexity of the program development and minimizing the function ca] 
program execution. A simple inline expansion procedur ... 

2 A comparative study of static and profile-based heuristics for inlining 
Matthew Arnold, Stephen Fink, Vivek Sarkar, Peter F. Sweeney

January 2000 ACM SIGPLAN Notices , Proceedings of the ACM SIGP 
Dynamic and adaptive compilation and optimization DYT 
Volume 35 Issue 7 
Publisher: ACM Press

Full text available: 18 pdf0.l3 Additional Information: full citation , abst 

MB) citings , index ten 

In this paper, we present a comparative study of static and profile-based 
inlining. Our motivation for this study is to use the results to design the t 
algorithm that we can for the Jalapeno dynamic optimizing compiler for 
well-known approximation algorithm for the KNAPSACK problem as a 
algorithm" for the inlining heuristics studied in this paper. We present p< 
for an implementation of these inlinin ... 

3 Practical virtual method call resolution for Java 

Vijay Sundaresan, Laurie Hendren, Chrislain Razafimahefa, Raja Vallee-R
Etienne Gagnon, Charles Godin 

October 2000 ACM SIGPLAN Notices , Proceedings of the 15th ACM ! 

conference on Object-oriented programming, systems, la 
applications OOPSLA '00, Volume 35 Issue 10

Publisher: ACM Press 

Full text available: pdf(323.98 Additional Information: full citation , abst

KB) citings , index ten 

This paper addresses the problem of resolving virtual method and interfa 
bytecode. The main focus is on a new practical technique that can be use 
applications. Our fundamental design goal was to develop a technique th 
with only one iteration, and thus scales linearly with the size of the progi 
same time providing more accurate results than two popular existing line 
hierarchy analysis and rapid type an ... 

4 On the conversion of indirect to direct recursion 
Owen Kaser, C. R. Ramakrishnan, Shaunak Pawagi

March 1993 ACM Letters on Programming Languages and Systems (L 

2 Issue 1-4 
Publisher: ACM Press 

Full text available: pdf(929.68 Additional Information: full citation , abst

KB) citings , index ten 

Procedure inlining can be used to convert mutual recursion to direct recu 
use of optimization techniques that are most easily applied to directly re< 
in addition to the well-known benefits of inlining. We present tight (nec« 
sufficient) conditions under which inlining can transform all mutual recu
recursion, and those under which heuristics to eliminate mutual recursioi 
We also present a technique ... 

Keywords: call graphs, inline substitution, mutual recursion, procedure : 



5 Dynamic Adaptive compilation: Adaptive online context-sensitive inlining 
Kim Hazelwood, David Grove 

March 2003 Proceedings of the international symposium on Code genei 
optimization: feedback-directed and runtime optimization 
Publisher: IEEE Computer Society 

Full text available: pdf(1.06 Additional Information: full citation , abst

As current trends in software development move toward more complex (
programming, inlining has become a vital optimization that provides sub 
performance improvements to C++ and Java programs. Yet, the aggressi 
inlining algorithm must be carefully monitored to effectively balance pei 
size. The state-of-the-art is to use profile information (associated with ca 
inlining decisions. In the presence of virtual method calls, profile ... 

6 Sealed calls in Java packages 

Ayal Zaks, Vitaly Feldman, Nava Aizikowitz 

October 2000 ACM SIGPLAN Notices , Proceedings of the 15th ACM J 
conference on Object-oriented programming, systems, la 
applications OOPSLA '00, Volume 35 Issue 10 

Publisher: ACM Press 

Full text available: S pdfQ 92.57 Additional Information: full citation , abst 

Determining the potential targets of virtual method invocations is essenti
procedural optimizations of object-oriented programs. It is generally har< 
targets accurately. The problem is especially difficult for dynamic langu; 
because additional targets of virtual calls may appear at runtime. Currenl 
enable inter-procedural optimizations for dynamic languages, repeatedly 
optimizations at runtime. This paper addresses this ... 

Keywords: Java, call devirtualization, call graph, class hierarchy graph,

on Programming language design and implementation PLD)

Issue 5 
Publisher: ACM Press 

Full text available: pdf(1.40 Additional Information: full citation , abst

MB) citings , index ten 

Existing research understates the benefits that can be obtained from inlin 
especially when guided by profile information. Our implementation of ir 
yields excellent results on average and very rarely lowers performance. ^ 
results can be explained by a number of factors: inlining at the intermedi 
removes most technical restrictions on what can be inlined; the ability to 
and incorporate profile information enables ... 

8 Automatic pool allocation for disjoint data structures 
Chris Lattner, Vikram Adve

June 2002 ACM SIGPLAN Notices , Proceedings of the 2002 workshop 
system performance MSP '02, Volume 38 Issue 2 supplement
Publisher: ACM Press 

Full text available: pdf(1.48 Additional Information: full citation , abst

This paper presents an analysis technique and a novel program transforn
enable powerful optimizations for entire linked data structures. The fully 
transformation converts ordinary programs to use pool (aka region) alloc 
based data structures. The transformation relies on an efficient link-time 
analysis to identify disjoint data structures in the program, to check whel 
structures are accessed in a type-safe manner, and to constru ... 

9 Partitioning sequential programs for CAD using a three-step approach 
Frank Vahid

July 2002 ACM Transactions on Design Automation of Electronic Syst 

Volume 7 Issue 3 

Many computer-aided design problems involve solutions that require the
large sequential program written in a language such as C or VHDL. Sucl 
improve design metrics such as performance, power, energy, size, input/ 
even CAD tool run-time and memory requirements, by partitioning amoi 
modules, hardware and software processors, or even among time-slices i 
computing devices. Previous partitioning approaches typically presel ... 

Keywords: Partitioning, behavioral partitioning, functional partitioning, 
partitioning, system level partitioning 

Suresh Jagannathan, Andrew Wright

A flow-directed inlining strategy uses information derived from control-:
specialize and inline procedures for functional and object-oriented langu 
control-flow analysis to identify candidate call sites, flow-directed inlinii 
procedures whose relationships to their call sites are not apparent. For in 
defined in other modules, passed as arguments, returned as values, or exi 
structures can all be inlined. Flow-d ... 

11 Online feedback-directed optimization of Java 

Matthew Arnold, Michael Hind, Barbara G. Ryder

November 2002 ACM SIGPLAN Notices , Proceedings of the 17th AO 
conference on Object-oriented programming, systems, 
applications OOPSLA '02, Volume 37 Issue 11

Publisher: ACM Press 

This paper describes the implementation of an online feedback-directed
system. The system is fully automatic; it requires no prior (offline) profi] 
previously developed low-overhead instrumentation sampling framewor] 
flow graph edge profiles. This profile information is used to drive severa 
optimizations, as well as a novel algorithm for performing feedback-dire 
graph node splitting. We empirically evaluate this syst ... 

Keywords: adaptive optimization, dynamic optimization, online algoritr 
machines 

The existence of statically detectable correlation among conditional bran
elimination, an optimization that has a number of benefits. This paper pr 
determine whether an interprocedural execution path leading to a conditi 
along which the branch outcome is known at compile time, and then to e 
along this path through code restructuring. The technique consists of a di 
interprocedural analysis that determines whethe ... 

13 Unexpected side effects of inline substitution: a case study 
Keith D. Cooper, Mary W. Hall, Linda Torczon

March 1992 ACM Letters on Programming Languages and Systems (L 

The structure of a program can encode implicit information that changes
speed of the generated code. Interprocedural transformations like inlinin;
information; using interprocedural data-flow information as a basis for o
have the same effect. In the course of a study on inline substitution with 
FORTRAN compilers, we encountered unexpected performance problen 
programs. This paper describes the specific ... 

Keywords: inline substitution, interprocedural analysis, interprocedural 

Michael G. Burke, Jong-Deok Choi, Stephen Fink, David Grove, Michael ]
Mauricio J. Serrano, V. C. Sreedhar, Harini Srinivasan, John Whaley 
June 1999 Proceedings of the ACM 1999 conference on Java Grande «L 

15 Polymorphic splitting: an effective polyvariant flow analysis
Andrew K. Wright, Suresh Jagannathan

January 1998 ACM Transactions on Programming Languages and Sysl 
Volume 20 Issue 1 

Publisher: ACM Press 

Full text available: pdf(517.76 Additional Information: full citation , abst

This article describes a general-purpose program analysis that computes
and data-flow information for higher-order, call-by-value languages. Th<
novel form of polyvariance called polymorphic splitting that uses let-expr
clues to gain precision. The information derived from the analysis is usei 
run-time checks and to inline procedure. The analysis and optimizations 
to a suite of Scheme progra ... 

Keywords: flow analysis, inlining, polyvariance, run-time checks 

Large complex programs are composed of many small routines that imp]
for the routines that call them. To be useful, an execution profiler must a 
time in a way that is significant for the logical structure of a program as 
textual decomposition. This data must then be displayed to the user in a < 
informative way. The gprof profiler accounts for the running time of call 
running time of the routines ... 

17 Static conflict analysis for multi-threaded object-oriented programs 
Christoph von Praun, Thomas R. Gross

May 2003 ACM SIGPLAN Notices , Proceedings of the ACM SIGPLA 
on Programming language design and implementation PLD] 
Issue 5 
Publisher: ACM Press 

A compiler for multi-threaded object-oriented programs needs informati<
of objects for a variety of reasons: to implement optimizations, to issue \ 
instrumentation to detect access violations that occur at runtime. An Obj 
(OUG) statically captures accesses from different threads to objects. An 
Heap Shape Graph (HSG), which is a compile-time abstraction for runtii 
and their reference relations (edges). An OUG specifie ... 

Keywords: heap shape graph, object use graph, program analysis, race d 
representations for concurrent programs 

We address the problem of register optimization that arises during high-]
modular hierarchical behavioral specifications. Register optimization is 1 
grouping carriers such that each group can be safely allocated to a hardw 
register optimization by inline expansion involves flattening the module 
a heuristic register optimization procedure on the flattened description. / 
expansion yields a near-optimal number of... 

Keywords: Behavioral synthesis, hardware description languages, hierai 
specifications, high-level synthesis, lifecycle analysis, register optimizat: 



19 A framework for call graph construction algorithms 
David Grove, Craig Chambers

A large number of call graph construction algorithms for object-oriented
languages have been proposed, each embodying different tradeoffs betw 
and call graph precision. In this article we present a unifying framework 
call graph construction algorithms and an empirical comparison of a repi 
algorithms. We first present a general parameterized algorithm that enco 
known and novel call graph construction algorithms. W ... 

Keywords: Call graph construction, control flow analysis, interprocedur 

Reducing application size is important for software that is distributed viz
order to keep download times manageable, and in the domain of embedd 
applications are often stored in (Read-Only or Flash) memory. This pape 
extraction techniques such as the removal of unreachable methods and n 
inlining of method calls, and transformation of the class hierarchy for re< 
size. We implemented a number of extraction techniques in < ... 

Keywords: Application extraction, call graph construction, class hierarc 
packaging, whole-program analysis 

Architectures and Compilation Techniques PACT '04
Publisher: IEEE Computer Society 

Full text available: pdf(241.65

Performing inlining of routines across file boundaries is known to yield !
performance improvements. In this paper, we present a scalable cross-rm 
framework that reduces the compiler's memory footprint, file thrashing, ; 
time. Instead of using the call-site ordering generated by the analysis phj 
transformation phase dynamically produces a new inlining order depend 
constraints of the system. We introduce dependences among ... 



2 Using annotations to reduce dynamic optimization time 
Chandra Krintz, Brad Calder

Dynamic compilation and optimization are widely used in heterogeneous
environments, in which an intermediate form of the code is compiled to t 
execution. An important trade off exists between the amount of time spen 
optimizing the program and the running time of the program. The time tt 
optimizations can cause significant delays during execution and also pre 
gains that result from more complex optimization. 

3 Compiler analysis and optimization: Providing time- and space- efficient p 
asynchronous software thread integration
Vasanth Asokan, Alexander G. Dean 

September 2004 Proceedings of the 2004 international conference on G 

architecture, and synthesis for embedded systems CAS 
Publisher: ACM Press 

Full text available: pdf(289.56 Additional Information: full citation , abst

Asynchronous Software Thread Integration (ASTI) provides fine-grain c
time threads by statically scheduling (integrating) code from primary thr
threads, reducing the context switching needed and allowing recovery of
time. Unlike STI, ASTI allows asynchronous thread progress. Current AS
not support procedure calls in the secondary thread because they lead to 
during static scheduling. ASTI requires knowing the sec ... 

Keywords: asynchronous software thread integration, fine-grain concun 
software migration, software-implemented communication protocol conl 

Object-oriented languages such as Java and Smalltalk provide a uniform
model, allowing objects to be conveniently shared. If implemented direc
reference models can suffer in efficiency due to additional memory dere:
memory management operations. Automatic inline allocation of child ot 
objects can reduce overheads of heap-allocated pointer-referenced objeci 
compiler analyses to identify inlinable fields by t ... 

While much current research concerns multiprocessor design, few traces
programs are available for analyzing the effect of design trade-offs. Exis 
methods have serious drawbacks: trap-driven methods often slow down ] 
by more than 1000 times, significantly perturbing program behavior; mi( 
modification is faster, but the technique is neither general nor portable. 1 
a new tool, called MPTRACE, for collecting tr . .. 

6 Exploiting the non-determinism and asynchrony of set iterators to reduce a 
latency

David C. Steere 

The Tensor Contraction Engine (TCE) is a domain-specific compiler for
complex tensor contraction expressions arising in quantum chemistry ap 
electronic structure. This paper develops a performance model for tensoi 
considering both disk I/O as well as inter-processor communication cost: 
performance-model driven loop optimization for this domain. Experimei 
provided that demonstrate the accuracy and effectiveness of the mod ... 

Keywords: compiler optimization, out-of-core algorithms, parallel algor 
modeling

We evaluate the performance of a user-space Direct Access File System
Oracle Disk Manager (ODM) client using two synthetic test codes as we
database. Tests were run on 4-processor Intel Xeon-based systems runnii 
The systems were connected with ServerNet II, a Virtual Interface Archi 
compliant system area network. We compare the performance of DAFS/ 
based I/O, measuring I/O bandwidth and latency. We also compare the r 

Keywords: DAFS, Database, File Systems, I/O, Networks, Performance

This paper gives a brief introduction to the application development envi
DECmpp 12000 Massively Parallel Computer. Specifically, the architecl 
system and compilers are discussed.

Functional partitioning assigns the functions of a system's program-like ;
system components, such as standard-software and custom-hardware pre 
introduce a new transformation, called procedure cloning, that significaii 
functional partitioning results. The transformation creates a clone of a pr 
by a particular procedure caller, so the clone can be assigned to the calle: 
in turn improves performance through reduced ... 

Keywords: behavioral synthesis, embedded systems, functional partitior 
hardware/software codesign, replication, system-level design, system-on 
transformations 



13 SYZYGY - A Framework for Scalable Cross-Module IPO 

Sungdo Moon, Xinliang D. Li, Robert Hundt, Dhruva R. Chakrabarti, Luis 
Srinivasan, Shin-Ming Liu 

March 2004 Proceedings of the international symposium on Code genei 
optimization: feedback-directed and runtime optimization 
Publisher: IEEE Computer Society 

Full text available: pdf(198.14 KB) Additional Information: full citation , abst

Performing analysis across module boundaries for an entire program is in
exploiting several runtime performance opportunities. However, due to sc
in existing full-program analysis frameworks, such performance opportui
realized by paying tremendous compile-time costs. Alternative solutions,
partial compilations or user assertions, are complicated or unsafe and as a
commercial applications are compiled today with cross-module optimizat:

14 Space and time-efficient memory layout for multiple inheritance 
Peter F. Sweeney, Joseph (Yossi) Gil

October 1999 ACM SIGPLAN Notices , Proceedings of the 14th ACM I 
conference on Object-oriented programming, systems, la 
applications OOPSLA '99, Volume 34 Issue 10 

Publisher: ACM Press 

Full text available: pdf(2.30 Additional Information: full citation , abst






MB) citings , index ten 

Traditional implementations of multiple inheritance bring about not only 
terms of run-time but also a significant increase in object space. For exai 
compiler-generated fields in a certain object can be as large as quadratic 
subobjects. The problem of efficient object layout is compounded by the 
different semantics of multiple inheritance: shared, in which a base class 
distinct ...

Practical implementations of real languages are often an excellent way o
applicability of theoretical principles. Many stresses and strains arise fro 
practicalities, such as performance and standard compatibility, to theoret 
methods. These stresses and strains are valuable sources of new research 
as an oft-needed check on the egos of theoreticians. Two fertile areas tha 
by implementations are

We describe a parallel, real-time garbage collector and present experime
demonstrate good scalability and good real-time bounds. The collector is 
shared-memory multiprocessors and is based on an earlier collector algo 
provided fixed bounds on the time any thread must pause for collection, 
earlier algorithm was designed for simple analysis, it had some impractic 
paper presents the extensions necessary for a pract ...

Java, an object-oriented language, uses virtual methods to support the ex
classes. Unfortunately, virtual method calls affect performance and thus 
implementation, especially when just-in-time (JIT) compilation is done. 
type feedback are solutions used by compilers for dynamically-typed obj 
languages such as SELF [1, 2, 3], where virtual call overheads are much 
performance than in Java. Wi ...

Memory hierarchy performance has always been an important issue in c<
design. The likelihood of a bottleneck in the memory hierarchy is increa:
improvements in microprocessor performance continue to outpace those
memory system. As a result, effective utilization of cache memories is ei 
architectures. The nature of procedural software poses visibility problem*
perform program optimization. One approach to increasing visibil ...

In an attempt to reduce the number of operand memory references, man)
have thirty-two or more general-purpose registers (e.g., MIPS, ARM, Sp 
Without special compiler optimizations, such as inlining or interprocedu